Rank | Count | Beginning |
---|---|---|
1663 | 768 | De |
2651 | 596 | Den |
2037 | 337 | Déi |
199 | 322 | Am |
8491 | 296 | Si |
5680 | 295 | Et |
532 | 248 | An |
6469 | 245 | Hien |
6390 | 184 | Hie |
3418 | 157 | Dës |
5137 | 139 | E |
9289 | 133 | Vun |
1367 | 122 | Dat |
6023 | 121 | Fir |
5375 | 117 | Eng |
7591 | 105 | No |
7933 | 102 | Op |
5334 | 94 | En |
7320 | 91 | Mat |
9647 | 84 | Well |
113 | 82 | Als |
997 | 81 | Bei |
8320 | 80 | Se |
7832 | 77 | Och |
8392 | 76 | Seng |
8807 | 69 | Sou |
4885 | 66 | Duerch |
9532 | 62 | Wéi |
9888 | 62 | Zu |
7 | 59 | A |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word N-grams at the beginning of sentences give some insight into sentence composition.
For N = 1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
SELECT SUBSTRING_INDEX(sentence, ' ', 1) AS beg, COUNT(*) AS cnt FROM sentences GROUP BY SUBSTRING_INDEX(sentence, ' ', 1) ORDER BY cnt DESC LIMIT 50;
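
The query above extends to longer beginnings by changing the third argument of `substring_index` from 1 to N. As a rough illustration outside the database, the same counting can be sketched in Python. The function name `sentence_beginnings`, its parameters, and the toy input strings are assumptions made for this example and are not part of the corpus tooling; the sketch assumes sentences are plain strings with words separated by single spaces.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent beginnings made of the first n words.

    Hypothetical helper: splitting on a single space and re-joining the
    first n pieces mirrors substring_index(sentence, ' ', n), including
    the case where a sentence has fewer than n words (the whole sentence
    is then used as its own beginning).
    """
    counts = Counter(" ".join(s.split(" ")[:n]) for s in sentences)
    # most_common sorts by descending count, like ORDER BY cnt DESC LIMIT ...
    return counts.most_common(limit)


if __name__ == "__main__":
    # Toy strings for illustration only, not real corpus sentences.
    sample = ["De w1 w2", "De w1 w3", "Den w4 w5", "Am w6"]
    for n in (1, 2):
        print(n, sentence_beginnings(sample, n=n, limit=3))
```

Unlike the SQL query, which leaves the order of equal counts undefined, `Counter.most_common` returns ties in first-seen order, so repeated runs on the same input give the same ranking.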